Utility-Embraced Microaggregation for Machine Learning Applications

Authors

Abstract

With access to vast amounts of data, privacy protection is more important than ever. Among various de-identification (anonymization) techniques, k-anonymous microaggregation has been widely studied since it enables us to balance data confidentiality and data utility. Despite plenty of microaggregation methods aimed at reducing information loss and/or computational complexity, machine learning (ML) models using the resulting aggregated data face the problem that they are not as effective as expected. Motivated by the fact that ML models can be heavily influenced by distorted training data (albeit slightly), we deliberate on the performance of microaggregation in terms of not only information loss but also data utility. In this paper, we propose Util-MA, a new utility-embraced microaggregation framework for ML applications. Specifically, unlike prior studies that apply microaggregation techniques directly to raw data, we design a unified framework to potentially enhance data utility while preserving k-anonymity through preprocessing steps including dimensionality reduction and clustering. By using real-world datasets, we empirically demonstrate the superiority of Util-MA over benchmark methods in terms of classification accuracy. Moreover, we investigate the importance of measuring key performance indicators (KPIs) of clustering; the clustering stage leads to high data utility when the clustering results substantially coincide with ground truth labels. We establish a close relationship between KPIs and classification accuracies, which tends to be revealed when a gain of our method over benchmark methods is observed. Our Util-MA framework is microaggregation-model-agnostic; thus, the underlying microaggregation model can be appropriately chosen according to one's needs and tasks.
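The abstract describes a pipeline of dimensionality reduction, clustering, and k-anonymous microaggregation, but gives no algorithmic detail. The following is a minimal sketch of that general idea, not the authors' Util-MA implementation. Everything specific in it is an assumption made for illustration: PCA via SVD as the reduction step, a plain k-means as the clustering step, fixed-size grouping with centroid replacement as the microaggregation model, purity as a stand-in clustering KPI, and all function names (`pca_reduce`, `kmeans`, `microaggregate`, `purity`, `pipeline_sketch`).

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project centered data onto its top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


def kmeans(Z, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd-style k-means; returns one cluster label per record."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=n_clusters, replace=False)]
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(n_iter):
        dist = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = Z[labels == c].mean(axis=0)
    return labels


def microaggregate(X, order, k):
    """Split records (taken in the given order) into groups of at least k
    and replace every record by its group centroid.

    The last group absorbs the remainder, so group sizes lie in [k, 2k - 1].
    """
    n = len(X)
    if n < k:
        raise ValueError("need at least k records to form one group")
    X_anon = np.empty_like(X, dtype=float)
    group_ids = np.empty(n, dtype=int)
    n_groups = n // k
    for g in range(n_groups):
        start = g * k
        stop = n if g == n_groups - 1 else start + k
        idx = order[start:stop]
        X_anon[idx] = X[idx].mean(axis=0)
        group_ids[idx] = g
    return X_anon, group_ids


def purity(labels, y):
    """Fraction of records whose cluster's majority class matches their own.

    Used here only as an illustrative clustering KPI; y must hold
    non-negative integer class labels.
    """
    hits = sum(np.bincount(y[labels == c]).max() for c in np.unique(labels))
    return hits / len(y)


def pipeline_sketch(X, k, n_components, n_clusters):
    """Reduce, cluster, then microaggregate the raw records."""
    X = np.asarray(X, dtype=float)
    Z = pca_reduce(X, n_components)
    labels = kmeans(Z, n_clusters)
    # Assumption: order records by cluster, then by first principal
    # component, so that groups are drawn mostly from a single cluster.
    order = np.lexsort((Z[:, 0], labels))
    X_anon, group_ids = microaggregate(X, order, k)
    return X_anon, group_ids, labels
```

In this sketch every released record shares its values with at least k - 1 others, which is the k-anonymity property the abstract refers to. Groups can straddle a cluster boundary when a cluster's size is not a multiple of k; a more careful variant would handle each cluster separately and merge clusters smaller than k. A classifier would then be trained on `X_anon`, and its accuracy compared against the clustering KPI computed from `labels` and the ground truth.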

Similar resources

Machine Learning for Medical Applications

Machine learning (ML) has been well recognized as an effective tool for researchers to handle the problems in signal and image processing. Machine learning is capable of offering automatic learning techniques to excerpt common patterns from empirical data and then make sophisticated decisions, based on the learned behaviors. Medicine has a large dimensionality of data and the medical application...

Machine learning for medical applications

Machine learning has been well applied and recognized as an effective tool to handle a wide range of real situations, including medical applications. In this scenario, it can help to alleviate problems typically suffered by researchers in this field, such as saving time for practitioners and providing unbiased results. This tutorial is concerned with the use of machine learning techniques to so...

Web mining: Machine learning for web applications

Introduction: With more than two billion pages created by millions of Web page authors and organizations, the World Wide Web is a tremendously rich knowledge base. The knowledge comes not only from the content of the pages themselves, but also from the unique characteristics of the Web, such as its hyperlink structure and its diversity of content and languages. Analysis of these characteristics ...

Machine Learning Theory and Applications for Healthcare

Department of Electronics and Communication, University of Allahabad, Allahabad, India; School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, Republic of Korea; Department of Computer Science and Engineering, School of Engineering and Computer Science, Oakland University, Rochester, MI, USA; School of Electrical and Automatic Engineering, Chan...

Data Quantization Optimized for Machine Learning Applications

The goal of machine learning applications is to find a classification function, y(x), for a test point x ∈ χ, given a training dataset {(x_n, t_n), 1 ≤ n ≤ N} in order to minimize some optimality criterion. In practice, the collection of data and feature extraction may not occur at the same time or in the same place as the application of the classification function. For instance, the user may wan...

Journal

Journal title: IEEE Access

Year: 2022

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2022.3183201